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ABSTRACT 

Database  management  systems  have  been  developed  mainly  for  commercial  and  business  pro- 
cessing. As  such,  it  generally  manages  formatted  data,  rather  than  information,  which  is  less 
structured  and  comes  in  many  forms  such  as  text,  graphics,  images,  voices,  and  signals.  Beyond 
the  capability  of  handling  multimedia  information,  non-business  application  areas  call  for  further 
requirements.  Real-time  processing,  risk  assessment,  and  partial  answers  are  some  of  the  more 
important  requirements,  especially  in  the  military  applications.  In  this  paper,  we  discuss  issues 
involved  in  creating  such  advanced  database  management  system  and  propose  some  approaches. 


I.  INTRODUCTION 

As  the  DBMS1  technology  matures,  its  applications  become  more  and  more  sophisticated  and 
diversified.  Each  new  application  generally  brings  along  new  requirements  as  well.  In  most  of 
the  past  years,  DBMS  research  and  development  was  concentrated  on  the  commercial  applica- 
tions where  the  data  is  usually  formatted.  In  the  newer  applications,  such  as  office  automation 
and  engineering,  data  come  in  both  formatted  and  unformatted  forms,  including  text,  graphics, 
and  images.  In  some  of  these  more  advanced  applications,  one  can  expect  data  to  be  in  voice  and 
signal  forms  as  well.  To  be  able  to  handle  the  data  in  the  unformatted  forms  increases  the  com- 
plexity. To  be  able  to  store  the  data  and  get  them  back  is  not  a  problem.  To  be  able  to  do  these 
operations  efficiently  is  a  different  matter.  Thus,  many  research  projects  have  been  staned  to 
address  the  new  problems  [BAT085,  CHOC81,  CHOC84,  CHRI86a,  CHRI86b,  CROW87, 
DADA86,  LUM82,  MADI87b,  MADI87c,  MASU87,  STON87,  WOEL86].  Compared  to  the 
DBMS  used  in  the  commercial  systems,  these  new  research  projects  intend  to  incorporate  into 
the  DBMS  many  functions  that  are  not  generally  considered  necessary  or  even  useful  in  com- 
mercial systems. 

Data  in  the  commercial  applications  generally  are  short,  from  few  bytes  to  few  hundred  bytes. 
Unformatted  data,  on  the  other  hand,  generally  are  long,  up  to  millions  of  bytes.  If  we  were  to 
only  store  and  retrieve  one  particular  unformatted  data  string,  we  would  not  have  much  of  a 
problem,  even  when  the  data  is  long.  We  can  simply  store  the  data  as  one  does  for  a  file,  and 
retrieve  it  as  a  file.  Indeed,  this  is  the  normal  way  that  one  does  with  the  graphics  and  image  data 
in  the  engineering  applications.  However,  to  do  it  this  way  means  one  would  not  be  gaining 
many  of  the  benefits  provided  by  the  DBMS  and  their  technology.  The  DBMS  here  merely 


text. 


1  The  term  DBMS  will  be  used  throughout  the  paper  as  both  singular  and  plural  depending  on  con- 


behaves  like  a  file  access  method.  To  gain  the  other  benefits,  we  must  provide  capabilities  to  pro- 
cess unformatted  data  in  a  similar  that  we  can  do  for  the  formatted  data.  Currently,  we  are  far 
from  this  goal. 

Indeed,  in  commercial  processing,  not  only  the  data  handled  by  the  DBMS  is  short  and  format- 
ted, but  the  structure  of  the  information  is  also  simple.  It  is  this  simplicity  that  allows  us  to  use 
tables,  as  one  does  with  the  relational  model,  to  store  and  process  the  data  effectively.  As  we 
learned  later  in  the  newer  applications,  we  found  that  we  must  have  a  more  powerful  data  model 
to  allow  us  to  capture  the  necessary  ingredients  presented  in  these  more  complex  applications. 
While  we  can  still  use  a  relational  DBMS  to  construct  the  complex  objects,  it  is  generally  agreed 
that  simple  enhancement  to  the  relational  systems  is  not  sufficient.  We  shall  discuss  this  point 
further  in  the  section  on  data  model. 

When  DBMS  technology  was  developing,  we  did  not  forsee  the  broad  use  of  this  technology. 
We  built  the  systems  with  operations  that  would  handle  the  simple  structures.  As  we  cannot 
expect  the  different  operations  that  will  be  needed  for  the  different  complex  objects  to  be  imple- 
mented, one  cannot  create  a  DBMS  that  would  provide  efficient  operations  for  these  unknown 
objects.  If  one  just  relies  on  the  normal  DBMS  operations  to  process  all  the  complex  objects,  one 
will  not  only  encounter  inefficiency  in  processing  the  data,  but  one  will  likely  lose  the  advantage 
of  the  use  of  high  level  language  interface  provided  by  a  DBMS. 

Consider,  for  example,  that  one  wants  to  use  a  DBMS  to  manage  the  data  in  engineering  designs. 
Suppose  one  would  like  to  display  a  car  in  a  graphical  form,  but  inside  the  system,  the  data  is 
stored  in  geometric  parameters,  not  just  a  simple  bit  string.  The  user  would  have  a  hard  time  to 
use  the  DBMS  to  rotate  the  view  of  the  car  using  only  the  DBMS  operations.  What  one  has  to  do 
is  to  write  programs  to  work  with  the  data  received  from  the  DBMS  and  transform  them  to  a 


suitable  form.  One  naturally  thinks  that  these  operations  should  be  defined  and  used  in  such  a 
way  as  if  the  DBMS  operations  have  been  extended  to  have  these  new  functions.  Such  a  concept 
is  referred  to  as  abstract  data  type  in  the  programming  language  environment  [GUTT78, 
LISK75,  SHAW80].  The  same  concept  can  be  utilized  in  the  DBMS  environment  as  well. 

Related  directly  to  the  issues  just  mentioned  is  the  user  interface.  While  much  has  been  done  in 
this  area,  much  remains  to  be  done.  One  reason  for  this  is  that  more  and  more  people  are  using 
DBMS  to  process  data.  The  users  today  range  from  the  very  sophisticated  to  the  very  naive.  Yet 
we  have  to  accommodate  all  of  them.  Whether  one  should  eventually  have  only  one  user  inter- 
face or  have  many  is  still  unclear.  Nevertheless,  the  issue  should  be  carefiully  studied. 

In  fact,  we  really  cannot  expect  to  know  what  will  be  coming  in  the  future  in  terms  of  applica- 
tions. Neither  can  we  see  what  will  be  the  development  of  different  technologies.  We  do  know, 
however,  that  we  want  the  computer  to  do  more  and  more  of  our  work.  We  want  to  have  the 
computer  to  do  searching  and  processing  better  than  we,  the  humans,  can.  Our  first  step  is  to 
approximate  ourselves.  At  least,  we  do  this  when  we  do  not  know  how  else  to  do.  This  means 
that  we  will  have  in  the  future  more  and  more  AI  techniques  incorporated  into  DBMS. 

To  achieve  this,  it  means  we  must  have  an  architecture  of  a  DBMS  to  be  so  structured  that  new 
techniques  can  be  'plugged  into'  an  existing  system  in  some  way.  Such  a  system  is  currentiy 
called  extensible  system  and  is  being  studied  in  many  places  [BAT087a,  BAT087b,  LIND86, 
McPH87,  OREN86,  SHCA86,  STON87].  We  would  like  to  further  extend  the  limited  goals  of 
some  of  the  studies  to  include  not  only  some  simple  additional  functions  that  are  allowed  to 
include  into  a  system,  but  also  sophiscated  techniques  like  AI,  signal  processing,  voice  and 
sound  processing,  etc. 

Users  of  any  computer  system  are  always  concerned  about  the  efficiency  of  the  system.  Thus, 
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creators  of  DBMS  have  always  tried  to  see  how  high  performance  can  be  achieved.  Today's 
DBMS  in  general  give  good  performance.  In  fact,  in  some  cases,  they  support  extremely  high 
transaction  rates.  However,  we  have  always  been  concerned  only  with  commercial  applications. 
There  are  other  applications  that  have  not  been  traditionally  our  concerns.  For  example,  we 
know  of  banking  applications,  where  the  transaction  volume  is  high,  and  customers  are  waiting 
in  line  at  the  terminals.  We  say  that  we  want  to  create  a  real-time  system  so  that  customers 
would  not  become  upset  for  waiting  too  long.  Here,  we  mean  'real-time'  to  be  10,  20,  or  30 
seconds  or  more.  Such  timing  would  be  quite  satisfactory,  as  people  can  not  react  much  faster. 

However,  in  some  advanced  military  applications,  this  kind  of  measure  is  totally  unacceptable. 
For  example,  if  the  DBMS  is  used  to  store  data  for  identifying  objects,  one  may  be  forced  to  do 
split  second  decisions  when  one  sees  an  object  approaching.  The  same  would  be  true  if  the  com- 
mander of  a  battle  group  of  ships  wants  to  see  how  he  would  want  to  react  to  a  battle  situation. 
He  does  not  always  have  the  luxury  to  wait  half  a  minute  just  to  see  what  may  come  out  from  the 
computer. 

Under  situations  like  these,  we  need  to  have  a  DBMS  that  not  only  can  handle  multimedia  infor- 
mation, but  alway  be  able  to  react  flexibly.  In  urgent  cases,  the  response  must  be  exceedingly 
rapid.  At  other  times,  one  can  afford  to  wait  a  while.  How  can  we  achieve  such  a  goal?  We  must 
start  from  the  beginning  to  design  a  DBMS  that  can  do  things  this  way.  We  do  have  a  number  of 
things  that  we  must  consider.  First,  we  must  see  how  to  separate  transactions  that  are  extremely 
urgent  from  the  ordinary  ones.  We  have  to  store  data  and  information  that  would  be  useful  for 
this  purpose.  We  most  likely  have  to  add  special  data  structures  that  would  be  useful  for  this  pur- 
pose. 

In  fact,  it  is  definitely  useful  to  see  how  one  can  assign  risks  to  the  transactions.  That  is,  if  we 


fail  to  response  to  a  query,  what  may  be  the  consequence.  If  the  case  is  between  life  and  death, 
as  would  be  the  case  in  some  combat  situations,  one  must  deliver  the  answer  fast,  extemely  fast, 
even  if  the  system  has  not  gotten  the  complete  answer.  This  approach  would  be  contrary  to  the 
current  practice  in  DBMS  applications.  Today,  a  DBMS  would  either  give  a  final  answer  or 
nothing  at  all.  It  will  search  its  whole  database  to  find  the  definite  answer.  It  will  not  say,  'this  is 
the  likely  answer'. 

This  is  not  the  way  people  do  things.  Most  of  the  time,  people  can  only  find  approximate  and 
incomplete  answers  and  do  the  best  we  can  with  them.  Even  in  AI  studies,  most  researchers  still 
try  to  work  with  the  approach  that  will  give  only  complete  and  definite  answers.  We  think  that  in 
many  applications,  maybe  the  probabilistic  or  heuristic  approach  would  be  sufficient.  In  fact,  if 
the  data  is  massive  and  the  urgency  is  acute,  this  may  be  the  only  alternative  other  than  giving 
nothing  at  all. 

To  include  probabilities  for  heuristic  search  in  the  stored  information  is  difficult.  This  happens 
because  we  cannot  forsee  all  the  applications  or  the  queries.  However,  in  many  cases,  we  can 
use  the  statistical  approach  such  as  sampling  to  assess  data  to  derive  information.  To  do  so 
would  require  additional  functions  and  structures  in  the  DBMS.  We  would  also  have  to  develop 
different  strategies  to  see  how  we  want  to  handle  the  different  queries. 

One  other  situation  that  we  feel  important,  but  has  traditionally  been  ignored,  is  that  in  many 
situations  we  cannot  limit  our  system  to  be  homogeneous.  For  example,  it  is  not  feasible  for  the 
whole  government  and  the  armed  forces  to  possess  DBMS  of  one  kind  or  use  only  one  data 
model  in  all  the  systems.  The  most  likely  scenario  is  that  there  will  be  many  systems  with  dif- 
ferent data  models,  but  information  is  to  be  derived  from  these  diverse  systems  and  assembled  in 
one  of  them  for  presentation.  We  should  develop  capabilities  that  would  allow  us  to  handle  this 


kind  of  situation. 

We  would  like  to  point  out  that  the  purpose  of  the  report  is  to  identify  issues  and  scope  out  the 
major  areas  that  need  solutions  to  provide  a  foundation  to  support  the  applications  in  the  various 
Navy  programs,  specifically,  in  the  Command  Systems  Technology  Block  Program.  Naturally 
any  operational  DBMS  must  incorporate  many  of  the  solutions  already  existing  or  becoming 
available  in  order  to  have  a  complete  system.  Since  we  have  only  a  very  small  resource,  we  feel 
that  it  is  best  for  us  to  work  on  those  problems  that  are  critical  for  us  to  have  solutions  in  order  to 
proceed  further,  or  that  are  not  actively  pursued  elsewhere  in  the  DBMS  research  community. 
Consequently,  important  areas  such  as  consistency  control  ,  recovery  and  reliability  in  a  distri- 
buted DBMS  environment,  are  not  addressed  as  they  are  actively  pursued  in  the  DBMS  research 
and  development  community  because  of  their  values  in  the  commercial  applications.  We  would 
also  like  to  point  out  that,  while  we  identify  problems  that  may  be  specific  in  the  Command  Sys- 
tems Block,  it  is  believed  that  the  solutions  when  available  will  have  a  much  broader  application. 

In  the  following  sections,  we  shall  discuss  some  of  the  above  issues  in  more  details.  We  shall 
also  propose  some  approaches  in  some  cases. 


n.  DATA  MODEL 

The  abstraction  concepts  our  model  proposed  in  [MADI87a]  supports  molecular  aggregation, 
generalization,  version,  and  instantiation.  These  concepts  are  necessary  (but  probably  not 
sufficient  in  many  application  areas)  to  capture  the  semantics  involved  in  the  advanced  applica- 
tion areas.  We  add  one  more  abstraction  concept  called  medium  to  the  model  so  now  it  will  be 
able  to  describe  multimedia  objects. 


Molecular  aggregation  is  the  abstraction  of  a  set  of  objects  and  their  relationships  into  a  higher- 
level  objects  [SMIT77].  This  abstraction  allows  a  viewing  of  objects  from  different  levels  of 
generality.  At  the  executive  level,  for  example,  one  may  only  be  interested  in  tank,  while  at  the 
plant  level,  one  may  be  interested  in  the  details  of  tank,  that  it  is  an  aggregation  of  cannon,  guns, 
caterpillar  treads,  etc.  At  the  wheel  engineering  department  level,  one  is  now  interested  in  the 
details  of  a  wheel.  The  idea  is  to  give  the  user  only  the  amount  of  details  he  needs  for  a  particu- 
lar application. 

The  generalization  concept  [SMIT77]  is  used  in  our  model  to  provide  the  relationship  between 
types  and  their  subtypes.  A  molecular  object  has  a  type  and  may  consist  of  many  subtypes.  For 
example,  a  molecular  object  of  type  surface  ship  has  subtypes  carrier,  destroyer,  frigate,  etc. 
In  other  words,  a  type  is  a  generalization  of  a  set  of  named  subtypes.  Notice  that  subtypes 
inherit  their  supertype's  attributes  and  possess  sets  of  attributes  unique  to  them.  A  molecular 
object  of  type  light  cruiser  inherits  the  attributes  of  surface  ship  and  has  a  set  of  attributes 
unique  to  it  such  as  number-of-depth-charges-loaded. 

A  version  of  a  type  (or  a  subtype)  is  defined  to  be  a  molecular  object  with  interface  details  com- 
pletely specified,  but  with  implementation  details  in  some  stage  of  completion.  A  molecular 
object  surface  ship  may  have  different  versions  such  as  nuclear-powered,  diesel- powered,  etc. 
These  versions  have  the  same  interface,  i.  e.  characteristics  which  distinguish  them  as  surface 
ships.  A  subtype  of  an  object,  on  the  other  hand,  has  different  interface  detail.  A  subtype  light 
cruiser  has  an  interface  different  from  the  subtype  carrier.  The  reader  may  wonder  how  the 
nuclear-powered  could  be  a  version  of  surface  ship  instead  of  a  subtype.  Whether  one 
identifies  an  object  as  a  subtype  or  a  version  depends  on  the  conceptual  designer,  and  it  is  not  the 
role  of  data  model  to  determine.  The  reason  our  model  includes  both  subtype  and  version  con- 
cepts is  because  they  make  the  modelling  of  application  more  precise.    If  there  is  only  one 
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concept  --  either  subtype  or  version  --  then  a  subtle  ambiguity  and  a  consequent  confusion  may 
result. 

An  object  is  created  by  instantiation  [BAT085].  Both  object  types  and  object  versions  can  be 
instantiated.  A  version  will  be  instantiated  to  provide  a  local  working  copy  of  a  previous  design, 
which  has  the  implementation  details  specified.  A  type  will  be  instantiated  to  provide  a  working 
copy  for  design  work  starting  from  scratch,  that  is,  no  implementation  details  are  specified.  In 
case  of  a  battleship  design,  one  instantiates  the  type  battleship  to  produce  a  new  design  of  a  bat- 
tleship from  scratch.  In  other  words,  the  designer  has  to  fill  in  the  values  for  number  and  types 
of  armaments,  displacement,  etc.  If  the  designer  instantiates  the  version  Iowa-class,  then  he 
need  not  specify  the  detail  for  the  cannons,  since  they  are  already  specified.  Versioning  allows 
expedient  designing  by  reusing  the  old  design. 

We  extend  the  described  model  to  incorporate  multimedia  data.  The  notion  is  similar  to  version. 
In  the  extended  model,  we  have  another  concept  called  medium.  An  object  type  may  have 
several  media,  and  a  medium  may  be  text,  graphic,  image,  voice,  signal,  etc.  So  the  medium 
specifies  the  mode  of  transmission  in  conveying  the  information  from  a  computer  to  a  human. 

Let's  say  that  the  object  frigate  has  graphic  as  one  of  its  media.  Then  the  displaying  of  frigate 
information  in  the  graphic  medium  allows  the  user  to  see  the  visual  presentation  of  the  frigate. 
The  regular  graphics  operations  such  as  zooming,  paning,  rotation,  scaling,  etc.  can  be  sup- 
ported. Medium  actually  specifies  the  procedural  detail  of  presenting  the  informational  content. 
Now,  using  this  graphics  display  of  a  frigate,  the  user  can  point  to  the  cannon  of  the  frigate  and 
receive  information  about  it.  If  the  object  cannon  also  has  the  graphics  medium,  then  it  too  can 
be  displayed  visually  (in  a  separate  window).  Otherwise,  the  regular  information  (record-based 
information  such  as  size,  firepower,  etc.)  of  the  cannon  is  presented.   When  there  is  more  than 


one  medium,  then  the  user  is  prompted  for  the  display  medium. 

Notice  that  the  framework  we  have  presented  here  will  accomodate  a  new  search  technique  in 
various  media.  We  do  not  currendy  have  an  efficient  way  to  recognize  an  object  in  the  digitized 
image.  For  example,  given  a  digitized  image  of  a  battle  scene,  it  is  not  possible  to  identify 
objects  such  as  tanks  and  armored  vehicles  depicted  in  the  scene.  If  a  new  algorithm  to  do  the 
task  is  invented,  then  it  can  be  easily  incorporated  in  our  proposed  framework,  because  the 
media  is  nothing  but  an  abstraction  of  the  procedural  detail  of  displaying. 

Our  approach  in  handling  multimedia  objects  is  not  to  equate  media  to  type.  In  the  case  where  a 
media  is  equated  to  type,  an  object  has  a  type  graphic,  text,  etc.,  while  in  our  case  an  object  is  of 
type  carrier,  submarine,  heavy  cruiser,  etc.,  which  has  a  different  media.  We  believe  our 
approach  closer  to  the  meaning  of  "medium",  which  is  the  mode  of  transferring  the  information. 
We  are  saying  that  the  object  has  an  intrinsic  nature  (ie.  person,  ship,  automobile,  river,  etc)  and 
this  intrinsic  nature  can  be  conveyed  to  others  via  different  media. 


HI.  USER  INTERFACE 

Of  many  system  requirements  that  the  new  advanced  DBMS  must  satisfy,  simplicity  require- 
ment would  rank  among  the  most  critical  ones.  No  matter  how  successful  we  are  in  meeting  the 
other  requirements,  if  the  system  is  not  simple  to  learn  and  use,  no  one  would  even  bother  trying 
to  use  the  system.  We  believe  a  key  to  simplicity  is  a  unified  user  interface.  We  would  like  to 
have  a  single,  coherent  interaction  method  for  the  manipulation  of  information,  be  it  an  informa- 
tion for  marketing  analysis,  personnel  data,  or  graphical  display  of  design  objects.   Since  the 
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types  of  information  manipulated  in  the  advanced  applications  are  complex,  we  feel  an  extension 
of  a  conventional  textual  language  would  be  inappropriate.  We  believe  the  graphical  interface 
approach  has  the  most  potential  for  becoming  a  unified  interaction  method  that  can  be  used  for 
diverse  application  areas.  Other  approaches  such  as  the  natural  language  interface  and 
semantic-based  interface  (a  textual  interface  based  on  semantic  data  models)  are  simply  not 
compatible  with  the  way  users  normally  function  in  carrying  out  the  engineering  and  business 
activities.  Many  graphical  interfaces  are  already  proposed  in  the  literature.  They  can  be 
classified  broadly  into  four  groups: 

a)  manipulating  business  forms  and  supporting  office  automation  [LUM82,  ROWE85,  SHU85, 
SMIT82,  ZL0082]; 

b)  querying  a  database  based  on  relational  or  semantic  data  model  [BRYC86,  ELMA85, 
GOLD85,  WONG82,  ZL0077]; 

c)  managing  system  resources  [GOLD83,  WTLL84];  and 

d)  developing  general-purpose  programs  [GLIN84,  RAED84]. 

We  think  it  is  possible  to  create  a  unified  graphical  interface  by  extending/combining/modifying 
these  proposed  methods. 

We  proposed  a  graphics  interface  to  the  database  and  reported  in  [WU87,  WU86].  Since  we  are 
dealing  with  multimedia  data,  the  graphics  mode  of  interaction  seems  to  be  most  natural  and 
effective.  Our  proposed  interface  provides  a  graphical  representation  for  the  generalization, 
aggregation,  classification,  and  association  concepts.  We  still  need  to  add  the  notion  of  version 
and  medium  to  the  proposed  interface  to  make  it  complete. 
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Our  proprosed  graphics  user  interface  GLAD  utilizes  a  bit-mapped,  high-resolution  graphics 
display  terminal.  The  screen  consists  of  two  types  of  windows:  schema  and  operation  window. 
In  the  schema  window,  GLAD  provides  an  elegant  visual  representation  of  real  world  abstrac- 
tion concepts  most  semantic  data  models  support:  aggregation,  generalization,  classification,  and 
association.  In  the  operation  windows,  GLAD  describes  objects,  displays  results,  and  allows 
users  to  specify  queries.  Windows  can  be  opened,  closed,  scaled,  and  moved  at  the  user's  will. 
The  horizontal  and  vertical  elevators  are  used  to  pan  different  portions  of  a  display  when  the 
whole  display  is  too  large  to  fit  in  a  window. 


IV.  SYSTEM  ARCHITECTURE 

As  mentioned  in  the  Introduction,  because  it  is  not  possible  to  anticipate  the  advances  in  technol- 
ogy in  the  future,  it  is  important  to  design  a  system  architecture  that  would  allow  us  to  incor- 
porate new  techniques  into  the  system  as  they  are  developed.  Figure  1  is  a  schematic  representa- 
tion of  the  system  in  which  we  have  outlined  some  of  the  key  components.  The  Supervisor  com- 
ponent, as  the  name  implies,  is  the  module  that  oversees  and  coordinates  all  the  activities  of  a 
transaction.  It  invokes  other  components  to  do  a  specific  task  and  assembles  the  information  it 
receives  from  them.  The  Data  Access  component,  in  addition  to  dealing  with  the  data  from  and 
to  the  storage  units,  actually  is  assumed  to  contain  many  service  components  to  be  invoked  by 
the  other  processing  components.  For  example,  the  locking  manager,  the  transaction  manager, 
the  recovery  unit,  etc.  are  components  in  this  box,  although  they  are  not  explicitly  illustrated  in 
the  diagram.  The  other  components  like  the  Formatted  Data  Processing  unit,  the  Text  Processing 
Unit,  etc.  are  specilized  components  that  would  perform  designated  tasks. 
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In  essence,  compared  to  today's  DBMS,  the  components  like  the  Text  Processing,  the  Graphics 
and  Image  Processing,  etc.  are  generally  nonexistent.  Although  conceptually  that  is  all,  actually 
there  is  a  major  difference  between  the  proposed  system  and  current  DBMS.  The  difference  is 
that  current  DBMS  are  not  designed  to  allow  its  components  replaced.  As  such,  the  modules' 
interfaces  are  not  so  clearly  defined.  Here,  it  is  of  paramount  importance  that  we  have  very  well 
and  very  clearly  defined  component  interfaces  so  that  each  component  as  indicated  in  the 
diagram  would  be  replaceable  by  another  one  in  its  place.  This  is  by  no  means  an  easy  task.  In 
fact,  one  may  question  whether  that  is  at  all  possible.  There  is  substantial  ground  for  this  skepti- 
cism, particularly  when  we  consider  the  complexity  that  may  be  contained  in  the  single  box 
called  Formatted  Data  Processing.  This  one  box  represents  a  great  part  of  today's  DBMS. 

While  this  question  cannot  be  answered  at  this  time,  it  certainly  is  an  area  where  research  can  be 
and  should  be  conducted.  Further,  not  all  the  processing  boxes  represent  the  same  complexity. 
For  example,  the  Text  Processing  component  is  likely  to  be  much  simpler  than  the  Graphics  and 
Image  Processing  unit.  In  fact,  we  may  even  want  to  split  this  latter  component  into  two:  one  for 
graphics  and  another  one  for  images.  The  complexity  of  the  units  depend  on  the  complexity  of 
the  structure  of  the  information  represented  by  the  data.  Images  are  definitely  more  complex  than 
text  or  graphics,  as  it  does  not  contain  regularity  as  in  the  others.  It  is  too  early  for  us  to  say  what 
may  or  may  not  be  possible  to  achieve  at  this  time.  We  only  wish  to  point  out  that,  if  we  do  not 
consider  this  factor,  we  will  run  the  risk  that  our  system  might  become  outdated  by  the  time  we 
have  implemented  a  prototype.  It  will  also  mean  that  we  will  forever  be  chasing  the  advances  in 
technology  but  never  can  we  catch  up. 

Pursuing  the  same  philosophy,  it  actually  means  that  we  should  further  divide  the  major  com- 
ponents into  subcomponents  in  such  a  way  that  would  allow  us  to  plug  into  the  system  replace- 
ment subcomponents.  Indeed,  an  ideal  is  to  have  a  system  with  many  parts  so  designed  that  they 
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can  be  replaced  at  will  without  disturbing  the  rest  of  the  system. 


V.  INTEGRATION  WITH  ADVANCED  TECHNIQUES 

In  the  last  section,  we  discussed  the  importance  of  having  an  architecture  in  the  system  that  we 
can  plug  in  new  components  when  new  techniques  are  developed.  Let  us  consider  here  more 
specifically  some  of  the  new  techniques  that  may  come  in  the  future. 

The  first  thing  that  comes  to  mind  is  probably  the  application  of  artificial  intelligence  (AI)  in 
DBMS.  AI,  by  definition,  is  the  area  of  research  where  one  defines  techniques  that  allow  us  to 
analyze  and  find  solutions  to  problems  as  done  by  people.  This  encompasses  the  representation 
of  information  and  knowledge,  the  storage  and  retrieval  of  the  information,  the  reasoning  or 
rationalization  process,  and  all  the  other  things  that  get  involved  in  the  process  to  get  to  an 
answer.  As  such,  we  can  see  that  the  area  has  been  around  as  long  as  the  existence  of  civilization 
itself.  Given  that  this  is  the  case,  we  can  see  that  it  is  not  possible  to  expect  that  we  will  have  all 
the  answers  from  AI  researchers  soon  in  the  future. 

Yet,  the  tie  between  database  techniques  and  AI,  in  our  view,  is  very  close  and  strong.  We  see 
that  when  a  solution  to  an  AI  approach  is  not  known,  researchers  in  AI  will  continue  to  pursue  it 
until  it  is  solved.  However,  once  an  AI  technique  becomes  clearly  defined,  then  the  database 
technologists  will  want  to  incorporate  that  technique  into  a  DBMS.  The  database  researchers  are 
system  builders  who  constantly  strive  to  incorporate  new  things  into  their  systems  to  solve 
broader  problems.  After  all,  database  researchers  always  have  to  find  answers  to  queries  posed 
by  people.  Since  the  power  of  the  high  level  queries  are  practically  capable  of  expressing  most 
of  the  questions  we  can  form,  even  when  they  are  complex,  database  researchers  in  reality  would 
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want  to  incorporate  AI  techniques  to  their  systems  if  these  techniques  are  found  to  be  useful  for 
them  to  answer  queries. 

If  this  view  is  correct,  then  the  incorporation  of  AI  techniques,  or  their  derivatives,  would  be  just 
a  matter  of  timing.  This  closeness  between  the  two  areas  makes  it  most  appropriate  that  one 
should  design  DBMS  that  tightly  couple  techniques  from  these  two  areas  together.  Thus,  the  pro- 
posals that  want  to  build  additional  layers  on  top  of  a  DBMS  to  allow  one  to  use  a  new  AI  tech- 
nique is  not  a  good  way  to  build  a  system.  This  approach  does  not  let  us  build  the  system  as  one 
piece  and  would  cause  us  to  lose  performance.  Most  likely,  many  things  will  be  done  more  than 
once  as  one  layer  would  not  know  what  to  expect  from  the  other. 

What  can  be  expected  from  the  advance  of  AI  techniques  is  not  possible  to  guess.  However,  if 
we  can  separate  the  function  that  applies  AI  techniques  to  query  processing  into  isolated  com- 
ponents, we  can  then  continue  to  include  the  new  techniques  as  they  are  developed.  One  should 
note,  however,  when  we  say  an  AI  component,  we  do  not  really  mean  that  there  is  only  one  pro- 
gram or  one  group  of  programs.  In  fact,  the  AI  box  actually  represents  many,  many  smaller 
boxes.  For  example,  we  can  see  that,  within  this  AI  box,  there  may  be  boxes  that  process  rules  or 
knowledge  bases.  There  may  be  a  box  to  assess  and  assign  risks,  probabilities,  or  to  interpolate 
statistics.  There  may  be  boxes  to  process  images,  or  many,  many  other  things.  While  we  do  not 
know  exactly  how  many  things  we  should  have  here,  we  would  know  how  to  deal  with  the  addi- 
tional ones  if  we  know  how  to  deal  with  some  of  them.  Thus,  we  shall  concentrate  on  only  few 
of  the  well  known  ones  in  our  research  at  this  time. 

Let  us  look  at  specific  examples  to  illustrate  our  point.  In  the  processing  of  formatted  data,  we 
know  fairly  well  how  to  handle  things.  We  have  good  data  structures  that  allow  us  to  process 
data  efficiently.  If  a  customer  of  a  bank,  for  instance,  asks  for  the  balance  of  his  account,  the  sys- 
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tern  knows  how  to  get  that  answer  without  doing  an  exhaustive  search,  in  general,  because  a 
DBMS  usually  has  indexing  methods  for  that  purpose.  In  cases  like  this,  we  can  actually  find  not 
just  an  answer,  but  a  precise  and  exact  answer.  Let  us  consider  what  happens  in  the  case  for 
unformatted  data. 

First,  let  us  look  at  the  simple  extension  from  formatted  data  to  text  data  such  as  the  one  in  a 
technical  paper  or  in  a  written  report.  While  many  researchers  have  looked  into  this  problem,  no 
good  solutions  have  been  found.  There  are  solutions,  though.  For  example,  if  we  want  to  know 
what  documents  contain  information  on  foreign  policy  on  Russia,  we  have  difficulty  getting  a 
precise  answer,  as  we  have  many  alternatives  to  go  about  doing  the  query  search  in  this  case.  For 
example,  do  we  look  for  documents  containing  exact  phrases  as  indicated  here?  Do  we  need  a 
document  containing  words  that  match  the  phrase  here?  Or  do  we  want  documents  dealing  with 
the  general  meaning  of  the  phrase  as  stated?  In  each  of  these  cases,  we  will  get  a  different 
answer.  It  can  be  much  harder  in  one  way  than  another.  Effective  and  efficient  retrieval  is 
difficult  because  we  are  now  dealing  with  information  rather  than  just  raw  data.  Information  and 
knowledge  is  generally  imprecise.  Answers  to  queries  on  them  are  therefore  generally  not  pre- 
cise either. 

When  we  extend  from  text  into  even  more  unformatted  data,  we  have  much  more  difficulty.  Let 
us  consider  image  data  as  an  example.  Assume  we  have  stored  in  our  system  all  the  photographs 
we  have  taken.  Now,  suppose  we  want  to  find  pictures  that  contain  at  least  one  destroyer.  How 
are  we  to  know  which  pictures  contain  destroyer?  We  first  have  to  know  what  a  destoyer  sup- 
posed to  look  like.  But  that  may  not  be  sufficient.  We  need  to  know  how  to  recognize  a  des- 
troyer in  all  the  different  positons  and  angles  and  in  groups,  if  we  wish  to  be  sure  that  we  get  the 
correct  answer,  namely  all  the  pictures  with  at  least  one  destoyer.  This  is  indeed  AI  research. 
Today,  AI  techniques  have  not  advanced  to  the  state  that  would  allow  us  to  process  queries  of 
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this  kind.  If  such  techniques  were  available,  there  is  no  question  that  they  should  be  included  in  a 
database  system  to  be  used  to  process  these  queries  in  the  same  way  that  we  do  for  the  formatted 
data  queries. 

However,  even  in  the  best  of  the  cases,  we  will  not  be  able  to  get  answers  to  queries  as  nicely  as 
we  do  in  the  formatted  data.  In  formatted  data  processing,  we  have  precise  answers.  In  this  kind 
of  information,  it  is  not  possible  to  expect  the  same  kind  of  service.  In  real  life,  if  we  have  a  per- 
son looking  at  the  pictures,  there  may  be  pictures  that  are  not  recognized  by  the  person  because 
of  various  reasons.  While  it  may  be  possible  for  the  computer  to  do  a  better  job  than  a  person,  it 
would  be  a  long  time  before  the  technology  would  advance  to  that  state  in  this  case. 

There  are,  of  course,  other  solutions.  We  can  include  a  description  such  as  keywords  to  each  pic- 
ture in  the  file.  We  can  then  search  the  key  words  instead.  This  approach,  in  fact,  is  the  one 
adopted  even  in  representing  text  documents.  While  we  can  accept  solutions  like  this,  it  is 
definitely  not  the  best  kind.  If  the  person  who  does  the  classification  has  forgotten  to  enter  a 
descriptor,  the  system  will  never  be  able  to  get  that  particular  document  or  picture.  Since  no  one 
can  think  of  all  the  descriptors  pertinent  to  queries  that  are  not  yet  defined,  it  is  a  sure  thing  that 
many  pictures  will  not  be  found  by  the  system  even  when  they  are  the  ones  desired  by  the  user 
who  enters  the  query. 

Further,  such  technique  is  not  so  useful  for  the  more  complex  queries.  For  example,  suppose  one 
wishes  to  find  the  pictures  that  contain  a  carrier  following  a  destroyer  in  a  battle  group  forma- 
tion. One  would  now  have  to  recognize  that  certain  motions  imply  something  or  other.  For 
example,  even  though  both  a  carrier  and  a  destroyer  are  detected,  if  the  carrier  is  in  front,  then 
this  picture  may  be  not  the  one  for  this  query.  What  we  want  to  say  is  this:  In  processing  infor- 
mation, one  needs  advances  in  many  fields,  especiflcally  in  AI.  Today,  we  use  DBMS  mainly  to 


17 


manage  formatted  data.  Actually,  we  want  to  develop  systems  that  would  manage  information, 
which  comes  to  us  in  many  forms.  However,  the  state  of  the  art  is  such  that  we  do  not  know 
how  to  handle  the  unformatted  data  well.  Nevertheless,  we  should  design  DBMS  in  such  a  way 
that  allow  us  to  include  the  advances  in  technology  in  other  fields. 

The  above  argument  can  easily  be  extended  to  include  voice  prcessing  or  signal  processing.  In 
commercial  data  processing,  signal  processing  is  not  very  useful.  In  military  data  processing, 
signal  processing  is  important.  We  shall  not  belabor  the  necessity  of  including  signal  processing 
in  a  DBMS,  as  there  are  numerous  examples  of  its  need  in  military  data  processing.  In  that 
environment,  information  is  stored  and  derived  from  many  forms,  frequently  in  signals.  In  cer- 
tain cases,  the  signals  are  transformed.  For  example,  the  noise  from  the  engine  of  a  submarine 
will  likely  be  some  graphic  waveforms  rather  than  just  as  sound  waves.  In  other  cases,  the  signal 
may  be  received  in  its  original  form.  Whatever  it  is,  the  system  may  be  required  to  store  and 
retrieve  it.  In  many  cases,  one  may  have  to  analyse  data  from  many  aspects  in  order  to  derive  an 
answer  to  a  query.  By  having  the  architecture  of  our  system  modular,  we  can  handle  this  kind  of 
processing  as  well. 

We  have  earlier  mentioned  that  risk  assessment  should  be  included  in  DBMS  for  military  appli- 
cations. Let  us  pursue  this  aspect  a  little  further. 

It  is  generally  true  that  some  queries  are  more  pressing  than  others.  In  military  situations,  the 
inability  to  complete  a  query  processing  may  have  dire  consequences.  For  example,  we  may 
have  received  a  signal  that  something  is  approaching  our  base  and  we  do  not  know  exactly  what 
it  is.  We  try  to  search  our  database  to  see  if  anything  resembling  this  signal.  If  we  do  not  finish 
processing  our  search,  numerous  lives  may  be  lost.  The  risk  is  high  in  this  case  and  it  should  be 
so  recognized. 
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One  may  think  that  we  should  classify  queries  into  priorities  and  the  system  will  handle  them 
accordingly.  This  is  a  beginning.  It  means  that  a  DBMS  should  include  at  least  priorities  in 
queries  and  shedule  and  process  them  accordingly.  It  is  not  sufficient.  We  should  go  further. 

Let  us  suppose  that  we  can  determine  the  likelihood  that  the  signal  is  not  a  threatening  one.  In 
that  case,  even  if  we  do  not  finish  processing  the  query,  there  is  not  much  of  a  danger.  On  the 
other  hand,  if,  after  some  analysis,  the  system  finds  that  the  signal  is  very  likely  to  have  a  highly 
destructive  effect,  such  result  should  be  communicated  to  the  user,  even  though  the  system  has 
not  completed  the  whole  search. 

Once  again,  it  seems  that  AI  techniques  will  be  involved.  We  think  that,  in  the  AI  box,  there 
should  be  modules  that  would  provide  assessment  on  the  risk  factor.  In  some  cases,  the  assess- 
ment may  depend  on  the  analysis  of  the  statistics  as  given  in  the  database.  Whether  one  would 
need  to  store  additional  information  in  the  system  to  use  depends  on  the  application  and  the 
algorithm  used  to  generate  an  answer.  For  example,  while  it  is  possible  to  derive  probability 
based  on  statistics,  it  is  not  possible  to  find  any  probability  if  the  statistics  and  information  were 
not  there.  For  example,  we  may  find  an  object  approaching  and  identify  the  object  to  be  hostile 
and  threatening,  but  since  the  database  does  not  contain  the  information  detailing  the  explosive 
power  of  the  object,  we  are  not  capable  to  assess  the  damage  that  may  be  caused  by  the  object. 
One  may  not  be  able  to  correctly  assign  the  risk  factor  for  this  situation. 

In  any  case,  risk  is  a  valid  factor  to  be  considered  and  the  system  should  include  some  intelli- 
gence in  it  so  that  appropriate  answer  can  be  generated. 

The  above  approach  really  argues  that  DBMS  and  AI  should  be  tightly  coupled  so  that  tech- 
niques from  the  two  areas  should  be  integrated.  While  this  view  may  not  be  shared  by  all 
researchers,  it  seems  to  us  most  appropriate.  Without  DBMS  technology,  AI  would  not  be  able 
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to  process  the  voluminous  data  in  an  organized  manner.  Without  the  use  of  some  of  the  AI  tech- 
niques, a  DBMS  will  not  be  able  to  process  intelligently  many  queries.  While  it  will  always  be 
true  that  AI  will  try  to  understand  and  find  solutions  to  many  seemingly  ill-defined  problems, 
once  a  solution  is  found,  it  would  be  appropriate  to  include  it  into  a  DBMS. 

From  the  architectural  point  of  view,  as  shown  in  Figure  1,  one  can  say  that  every  DBMS  should 
have  an  AI  component  that  allows  the  DBMS  to  answer  queries  which  require  logical  or  heuris- 
tic deductions  or  inductions.  The  DBMS,  in  this  case  will  then  be  the  superstructure  that  inter- 
faces directly  with  the  users.  Such  proposal  is  logical  as  DBMS  traditionally  deals  with  end-user 
query  interfaces.  On  the  other  hand,  it  can  also  be  stated  that,  within  every  AI  system,  there  is  a 
database  system.  As  true  to  life  applications  always  will  need  tens  of  thousands  of  rules  and  mil- 
lions of  facts  or  pieces  of  information,  a  DBMS  is  needed  to  help  manage  and  process  these  rules 
and  the  information,  freeing  the  AI  component  to  do  the  logical  or  heuristic  processing.  Whether 
it  is  one  way  or  another,  it  is  not  clear  at  this  time.  Much  research  is  needed  to  answer  this  ques- 
tion. It  is  clear  to  us,  however,  that  a  strong  coupling  or  integration  is  likely  the  direction  to 
proceed. 


VI.  PERFORMANCE  ENHANCEMENT 

As  said  before,  performance  in  military  applications  at  times  can  be  more  demanding  than  com- 
mercial processing.  The  real-time  element  here  can  be  most  critical.  While  in  commercial  appli- 
cations, a  delay  for  processing  a  query  may  cause  an  organization  a  very  large  sum  of  money,  in 
military  applications,  delay  in  response  in  some  cases  can  literally  mean  life  and  death  to  many 
people.  Even  when  a  military  application  may  not  have  the  high  transaction  volume  as  one  in 
commercial  application,  the  response  may  be  a  lot  more  urgent.  Split  second  timing  may  be 
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necessary.  Thus,  it  is  imperative  that  we  should  see  how  such  demands  can  be  satisfied. 

Traditionally,  database  management  systems  are  designed  to  use  moderate  main  memory  space 
to  hold  the  necessary  data  from  secondary  storage  devices  and  use  software  to  process  the  data 
brought  in  from  storage.  As  software  speed  is  restricted  by  the  hardware  capability,  there  is  a 
hardware  limit  to  what  the  designer  and  implementer  can  do  to  make  the  system  faster.  How- 
ever, if  one  were  to  use  hardware  to  implement  some  of  the  database  functions,  an  increase  in 
performance  can  be  obtained.  This  is  the  approach  of  the  database  machine  research.  However, 
in  database  machines,  we  have  not  gone  far  enough.  We  have  not  tried  to  encapsulate  many  of 
the  database  processing  algorithms  in  hardware  form.  As  VLSI  techniques  have  now  been  so  far 
advanced  that  almost  all  algorithms  can  be  captured  in  chips,  we  should  take  advantage  of  that 
by  designing  our  system  in  such  a  way  that  we  can  interchange  functions  between  hardware  and 
software.  For  example,  we  can  have  a  computer  chip  that  does  nothing  but  sorting.  We  can  have 
another  computer  chip  that  does  only  index  search.  And  so  on  and  so  forth.  In  essence,  we  are 
saying  that  the  software  and  hardware  boundaries  are  no  longer  distinct  as  it  is  today.  With  such 
an  architecture,  we  will  be  able  to  take  advantage  of  hardware  advances  as  they  come  to 
increase  performance.  Naturally,  by  replacing  software  algorithms  in  hardware  form,  perfor- 
mance gains  can  be  tremendous. 

Another  area  where  hardware  advance  can  be  utilized  to  advantage  in  increasing  performance  is 
the  use  of  massive  memory  in  a  system.  In  recent  years,  main  memory  size  has  grown  by  leaps 
and  bounds.  It  is  forseeable  that,  in  the  future,  one  can  consider  it  feasible  to  have  a  substantial 
portion  of  the  databases  residing  in  the  main  memory.  It  is  this  kind  of  expectation  that  research- 
ers have  in  the  past  few  years  started  research  on  main  memory  databases  [DeWI84,  LEHM86a, 
LEHM86b].  In  their  works,  it  has  been  shown  that  other  than  the  gain  in  pure,  raw  speed  over 
secondary  storage  devices,  one  would  have  to  use  different  strategies  and  data  structures  in  the 
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DBMS  in  order  to  get  the  maximum  utilization  of  the  increased  memory  size.  In  fact,  pursuing 
further  on  the  use  of  main  memory  to  enhance  performance,  we  will  investigate  the  problem  of 
allocating  some  primary  data  in  the  memory  in  expectation  of  the  urgent  demands  as  discussed 
above  on  risk  processing.  Thus,  if  we  can  identify  certain  group  of  data  that  is  most  likely 
needed  to  process  the  high  risk  queries,  we  can  store  the  data  in  the  main  memory  all  the  time.  In 
this  way,  we  shall  be  able  to  satisfy  the  urgent  demands  when  they  occur. 

Just  storing  data  in  the  main  memory  alone  would  not  be  able  to  satisfy  the  critical  demand  of 
split  second  responses,  as  frequendy  there  is  too  much  processing  to  be  done  and  a  lot  of  data  to 
be  processed.  Having  hardware  embodiment  of  algorithms  as  just  proposed  would  help.  How- 
ever, when  the  data  volume  to  be  processed  is  massive,  one  would  need  additional  means  to 
achieve  our  goal.  For  example,  suppose  one  has  to  derive  an  answer  statistically  and  the  volume 
of  data  is  huge.  Is  there  something  that  we  can  do  to  help  the  system.  The  answer  is  positive. 
Under  certain  circumstances,  we  can  build  additional  structures  and  store  redundant  information 
to  facilitate  the  process.  The  paper  by  Srivastana  and  Lum  [SRIV86]  illustrates  this  point. 

Statistics  applications  in  general  require  the  evaluation  of  a  large  amount  of  data  to  derive  a 
value.  For  example,  finding  the  mean,  the  deviation  from  the  mean,  the  confidence  level  that  the 
result  is  meaningful,  etc.  are  some  of  the  common  calculations  in  statistical  applications.  Using 
only  the  raw  data  and  not  to  have  additional  stored  data  for  these  calculations  would  definitely 
be  time-consuming  and  requiring  a  lot  of  processing.  We  can,  however,  store  additional  statisti- 
cal values  as  suggested  in  that  paper.  In  this  way,  statistical  information  can  be  determined  at  a 
small  fraction  of  the  time  needed  otherwise,  and  queries  requiring  quick  and  urgent  responses 
can  be  thus  satisfied. 

Other  techniques  that  may  be  used  to  process  queries  requiring  split  second  responses  include 
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the  use  of  approximation  techniques  and  putting  out  answers  that  are  not  complete.  In  many 
scientific  displines  such  as  engineering,  the  practice  is  to  use  approximates  but  on  the  safe  side. 
In  current  database  processing,  we  do  not  do  anything  like  that.  In  actual  human  behavior,  it  is 
more  often  than  not  that  we  use  imprecise  information  and  satisfy  ourselves  with  mostly  approx- 
imate answers.  As  a  first  step,  if  not  the  last,  maybe  we  should  have  the  same  approach  in  the 
system  as  persons  actually  do.  Moreover,  partial  answers  are  not  necessarily  useless.  Fre- 
quently, people  can  extrapolate  from  the  partial  answer  to  arrive  at  a  useful  decision. 

VEL  MULTI-DATA  MODEL  DBMS 

In  the  past  years,  DBMS  have  become  used  in  many  diversified  applications.  It  is  so  pervasive 
that  almost  all  major  systems  have  in  it  a  DBMS.  Moreover,  as  there  is  not  any  'standard'  data 
model,  over  the  years,  DBMS  have  been  developed  to  have  a  number  of  data  models  and  inter- 
faces, including  query  languages.  Not  only  that  it  is  not  practical,  but  it  is  really  not  feasible  to 
think  that,  in  major  organizations  as  big  as  the  US  government,  or  even  just  one  branch  of  the 
armed  services,  all  its  DBMS  are  homogeneous.  In  fact,  it  is  expected  to  be  exactly  the  opposite. 
That  is,  users  cannot  afford  to  throw  the  systems  they  have  been  using  for  years  to  go  into  a  new 
system.  This  would  be  the  case  even  when  financially  and  politically  realizable,  because  realisti- 
cally it  is  not  possible  to  have  the  manpower  to  make  it  come  true.  Thus,  what  we  need  is  to  find 
a  way  that  the  various  DBMS  can  commuincate  with  each  other.  If  this  is  not  possible,  we 
should  have  at  least  in  some  place  a  DBMS  that  knows  how  to  integrate  data  obtained  from  the 
various  systems  with  different  models. 

Figure  2  illustrates  a  scenario  which  may  occur.  A  commander  is  situated  in  an  area  served 
direcdy  by  System  A.  However,  to  enable  him  to  make  his  decision  intelligently,  he  needs  infor- 
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mation  from  many  systems,  namely  System  B,  C,  and  D.  Now,  as  these  systems  contain  hetero- 
geneous DBMS,  running  with  different  hardware  and  software  configurations  and  using  different 
data  models,  it  is  necessary  to  find  a  way  so  that  information  can  be  obtained  at  the  site  of  Sys- 
tem A.  One  such  way  is  to  employ  a  multi-model  and  multi-lingual  DBMS  [DEMU87]  at  the  site 
of  System  A.  In  this  way,  the  user  in  System  A  may  write  database  transactions  for  the  other 
systems  and  obtain  information  from  them  to  be  converted  and  displayed  appropriately  at  the 
user's  location. 

Extrapolating  this  scenario,  one  can  see  that  it  may  be  advantageous  for  the  different  systems, 
namely  B,  C  and  D,  to  have  a  copy  of  the  muli-model  and  multi-lingual  DBMS  at  their  sites, 
coexisting  with  other  DBMS,  so  that  every  site  can  receive  and  send  information  to  others  as 
required. 


VHI.  CONCLUSION 

This  paper  describes  an  overview  of  a  DBMS  project  at  the  Naval  Postgraduate  School  initiated 
to  develop  an  advanced  DBMS  that  incorporates  many  new  concepts  that  are  not  yet  addressed 
in  today's  research.  The  concepts  of  risk  assessment  and  the  use  of  approximation  or  even  par- 
tial answers  in  certain  situations  to  answer  queries  are  definitely  new.  While  such  approaches 
may  not  be  completely  appropriate  in  commercial  applications  for  which  current  DBMS  have 
been  developed,  they  are  most  useful  in  other  applications  that  occur  in  our  normal  daily  life 
environment. 

Today's  DBMS  manages  massive  amount  of  data  in  the  business  world  very  well,  but  they  are 
not  well  suited  for  scientific  or  engineering  applications  and  not  so  useful  in  other  situations  such 
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as  military  applications.  This  occurs  because  the  emphasis  we  have  placed  in  the  past  is  to  have 
the  DBMS  handle  data,  but  not  information.  People  process  information,  but  the  systems  we 
developed  only  process  data.  Information  also  come  in  different  forms:  fixed-size  records,  text, 
graphics,  images,  voices  and  signals.  Advanced  DBMS,  thus,  should  have  a  capability  of  han- 
dling these  multimedia  information.  Researchers  in  AI  are  more  oriented  toward  the  develop- 
ment of  techniques  to  handle  information  (not  necessarily  multimedia  information,  though). 
Unfortunately,  as  information  is  very  difficult  to  be  precisely  defined,  advances  in  AI  research 
have  been  slow.  Since  DBMS  are  required  to  be  "smarter"  in  answering  queries  in  advanced 
application  areas,  techniques  developed  in  AI  for  processing  information  are  directly  relevant 
and  must  be  incorporated  into  advanced  DBMS  whenever  appropriate.  It  is  our  belief  that  AI 
techniques  are  one  of  the  most  important  ingredient  for  a  successful  realization  of  an  advanced 
DBMS. 

Since  it  is  not  possible  to  predict  something  that  has  not  been  developed,  and  since  it  is  not  prac- 
tical to  build  a  new  DBMS  whenever  new  techniques  are  found,  we  believe  that  we  must 
develop  an  architecture  for  the  DBMS  in  such  a  way  that  it  can  be  flexibly  and  broadly  extended 
to  incorporate  new  development  in  multimedia  processing,  inference  capalibities,  storage 
methods  and  others.  A  highly  extensible  architecture,  with  capability  to  handle  multiple  data 
languages  and  models,  much  beyond  that  has  been  conceived  in  other  research  programs,  is 
indispensable  for  realizing  an  advanced  DBMS. 

Our  report  has  identified  many  problems  and  a  rather  broad  scope,  even  though  we  considered 
only  a  subset  of  the  important  areas  that  are  pertinent  to  the  Command  Systems  Technology 
Block  Program.  Our  next  step  is  to  narrow  down  the  scope  further  to  arrive  at  a  better  focus  by 
studying  more  closely  the  applications  in  this  program.  Meanwhile,  we  shall  direct  our  efforts  on 
object-oriented  data  model,  user  interface,  and  intelligent  processing  capabilities  such  as  risk 
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assessment,  which  are  believed  to  be  basic  in  the  pursuit  of  developing  an  advanced  DBMS  to 
support  these  applications. 
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